# Multimodal vision model

Owlv2 Large Patch14 Ensemble

OWLv2 is a zero-shot text-conditioned object detection model that can detect objects in images through text queries.

Thomasboosinger

Owlv2 Base Patch16

OWLv2 is a zero-shot text-conditioned object detection model that can detect and locate objects in images through text queries.

Owlv2 Large Patch14 Finetuned

OWLv2 is a zero-shot text-conditioned object detection model that can detect objects in images through text queries without requiring category-specific training data.

Owlv2 Base Patch16 Finetuned

OWLv2 is a zero-shot text-conditioned object detection model that can retrieve objects in images through text queries.

Object Detection

Owlvit Base Patch32

OWL-ViT is a zero-shot text-conditioned object detection model that can search for objects in images via text queries without requiring category-specific training data.
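The entries above all describe the same idea: candidate regions of an image are scored against free-text queries, so no category-specific training is needed. The following is a minimal conceptual sketch of that scoring step, not the actual OWL-ViT or OWLv2 implementation. It assumes each candidate box and each text query has already been turned into an embedding vector; the names `cosine` and `detect`, the two-dimensional embeddings, and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect(box_embeddings, boxes, query_embeddings, labels, threshold=0.8):
    """Assign each candidate box its best-matching text query.

    Hypothetical helper: a box is kept only if its best similarity
    reaches `threshold`; otherwise it is treated as background.
    """
    detections = []
    for embedding, box in zip(box_embeddings, boxes):
        scores = [cosine(embedding, query) for query in query_embeddings]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            detections.append(
                {"label": labels[best], "score": scores[best], "box": box}
            )
    return detections


# Hand-made toy data (assumption): real models produce high-dimensional
# embeddings from an image encoder and a text encoder.
labels = ["a photo of a cat", "a photo of a dog"]
query_embeddings = [[1.0, 0.0], [0.0, 1.0]]
box_embeddings = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]
boxes = [(10, 10, 50, 50), (60, 20, 120, 90), (0, 0, 5, 5)]

results = detect(box_embeddings, boxes, query_embeddings, labels)
for detection in results:
    print(detection["label"], detection["box"])
```

Because the label set is just a list of strings, changing what gets detected only requires changing the queries, which is what "zero-shot" means in the descriptions above. The third toy box matches neither query strongly and is dropped by the threshold.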


© 2025 AIbase